Irreversibility in bacterial regulatory networks

Irreversibility, in which a transient perturbation leaves a system in a new state, is an emergent property in systems of interacting entities. This property has well-established implications in statistical physics but remains underexplored in biological networks, especially for bacteria and other prokaryotes whose regulation of gene expression occurs predominantly at the transcriptional level. Focusing on the reconstructed regulatory network of Escherichia coli, we examine network responses to transient single-gene perturbations. We predict irreversibility in numerous cases and find that the incidence of irreversibility increases with the proximity of the perturbed gene to positive circuits in the network. Comparison with experimental data suggests a connection between the predicted irreversibility to transient perturbations and the evolutionary response to permanent perturbations.


Summary
This Supplementary Material file contains details regarding (i) the consistency of the irreversibility estimates across realizations, (ii) a characterization of the nodes not exhibiting irreversibility, (iii) a discussion of the necessity of positive circuits for irreversibility, (iv) conditions necessary for the preservation of synchronous partial fixed points under asynchronous updates, (v) an examination of the network motifs responsible for certain periodic attractors, (vi) a comparison with alternative formulations of the ensemble of rules, (vii) a comparison with previous studies investigating bacterial heterogeneity, (viii) a comparison of adaptive evolution responses to crp KO with irreversible response genes in the Boolean model, and (ix) a detailed description of how to use our results to obtain specific experimental predictions.

Irreversibility across realizations of the update rules
We demonstrate that the averages for the irreversibility estimated from different samplings of the rules converge to a common value.Figure S2 presents the distributions of the RMSD for all considered parameter pairs with nonunique rules for ascending and descending input sortings (i.e., diffuse and concentrated control) in panels A and B, respectively.The plots in each row are arranged in order of decreasing rule bias [cf., Fig. 4A].Importantly, neither the rule bias nor the average canalization depth appears to alter the convergence.
We observe that the RMSD converges to approximately 0.1 in probability for two independent ensembles of M 0 = 10 realizations, with slightly higher values observed in ascending order compared to descending order.These results establish that M = 20 realizations are sufficient to obtain reliable estimates of the average irreversibility.In the main text, the ensemble averages over realizations are compared with irreversibility for unique rules, which do not require averaging.
In addition to considering the impact of update rules, we also examined the impact of changing the point of the attractor in which the perturbation was reverted in the case of periodic attractors.Throughout the paper, we assess irreversibility by applying the perturbation to the first point of each attractor recorded by the SAT algorithm (37).Similarly, we revert the perturbation in the first point of the perturbed trajectory that is on the new attractor.We argue that the impact of this choice is small, which we confirm by comparing with the irreversibility observed when reverting the perturbation of the closest point of the new attractor to the initial state in terms of Hamming distance.We find that reverting from this state causes 2.1% of irreversible transitions to become reversible.

Nodes not exhibiting irreversibility
The algorithm that generates the core network G 0 recursively trims nodes that are trivial SCCs, so that the core network contains all irreversible perturbations.This is the case because if all inputs of an SCC consisting of a single node v with no autoregulation remain unchanged after a transient perturbation, then x v must remain unchanged as well.Thus, it can be concluded that the trimming will not alter the number of fixed points in the network, which is consistent with known necessary conditions for multi-stationarity (38)(39)(40)61).Now focusing on the core, Fig. 5 shows that 36 nodes within G 0 do not admit irreversible perturbations across all realizations of the rules in our simulations.We examine these nodes on a case-by-case basis and show that they cannot be irreversible by directly demonstrating that the final state must belong to the same attractor as the initial state.There are two cases: 1. Reversibility of leaf nodes.Consider a leaf node u, which by definition has no edges to other nodes.Such a node is necessarily reversible because it does not influence other nodes and is restored to its original state upon removal of the transient perturbation.A total of 32 nodes are in this class-aidB, araC, asnC, betI, cadC, cusR, dnaA, dpiA, fucR, glcC, glnG, hyfR, idnR, lldR, lsrR, malI, melR, metR, mraZ, nhaR, nikR, pdeL, prpR, purR, putA, puuR, rbsR, tdcA, yeiL, yiaJ, yqjI, and zraR.

2.
Reversibility of nodes exclusively regulating autorepressive leaf nodes.The genes metJ, pdhR, argP, and nac have a single regulatory output to an autorepressive leaf node.Either the regulatory node canalizes the output of the leaf node or the leaf node's autoregulatory input (which is by definition 0) overrides the other input.In the former case, any change to the state of the leaf node caused by the perturbation of the regulatory node is restored by its reversion, whereas in the latter case the state of the leaf node never changes.

Necessity of positive circuits for irreversibility
In the main text, we focus on positive circuits and claim that these circuits underlie the multistability necessary for irreversibility.Here we justify this claim.We first observe that the edge consistency condition implies the existence of at least one circuit in the network.The edge consistency condition requires that each node in the network has an incident edge (this assumes the existence of autoregulatory edges in nodes without input edges from other nodes).It follows that in a connected component of |V | nodes, starting at an arbitrary node and following incident edges backwards |V | times, we necessarily visit a node more than once.This implies the existence of a circuit.
We next observe that circuits are necessary for multistability and, by implication, for irreversibility.This is because nodes not belonging to circuits are trivial SCCs and thus are not irreversible response nodes according to the argument in the previous section.The phoB origon core network has numerous positive circuits, guaranteeing multiple stable attractors.
Irreversibility occurs between two attractors.To orient the discussion, we note that in our simulations, 42.2% of the attractors are period-1 attractors, 24.4% are period-2 attractors, 1.1% are period-3 attractors, 27.9% are period-4 attractors, and 4.4% are higher-period attractors.For brevity, we use the language periodic attractors to refer to period-n attractors with n > 1, while we continue to use fixed points to refer to period-1 attractors.Irreversibility may involve a transition between two fixed-point attractors, two periodic attractors, or a fixed-point attractor and a periodic attractor.
We first consider the case of irreversibility arising from a transition between two fixed-point attractors, noting that 49.0% of transitions in our simulations are of this type.We observe that fixed-point attractors cannot arise from negative circuits.In a network consisting of a negative circuit, the fixed-point condition x = B(x) implies that the state x v would be negated an odd number of times in the process of applying the update rules (38), which would require that x v = xv (we recall that the overbar indicates negation).Since this is impossible, no fixed-point attractors can exist for a network consisting of a negative circuit.In the absence of positive circuits, application of the fixed-point condition to a network including any number of negative circuits will yield one or more contradictions of the form x v = xv , which again exclude the possibility of fixed-point attractors.In the absence of positive circuits, the exceptions are: (i) negative loops consisting of a single node with an autorepressive edge, which is fixed but monostable, and (ii) negative circuits that remain fixed due to inputs from upstream nodes, which precludes the possibility of multistability.Thus, irreversibility between fixed points requires positive circuits.
We note that periodic attractors in which all nodes are time-dependent are absent in our simulations.Thus, transitions in which one or more attractors are periodic involve partial fixed points, which we define as time-dependent attractors in which the state of one or more nodes are time-independent.A total of 31.7% of irreversibility transitions in our simulations occur between partial fixed-point attractors with all differences taking place among the time-independent nodes.An additional 2.2% involve a transition between a fixed-point and a partial fixed-point attractor.The remaining transitions (17.1%) occur between partial fixed-point attractors with different numbers of fixed nodes, meaning that a subset of the time-independent nodes transition to being time-dependent and/or vice versa.In all cases, the differences between attractors involve a node that is time-independent in at least one of the attractors.We observe that the arguments of the previous paragraph apply in the sense that positive circuits determine the time-independent nodes in partial fixed-point attractors.
Thus in all transitions, the attractors are determined (at least in part) by positive circuits.
We expect the time-independent portions of the initial and final attractors observed in our simulations to be preserved under asynchronous updates in the large majority of cases.This is the case because fixedpoint attractors are preserved under asynchronous updates (62), implying that the initial and final states in the 49.0% of transitions between fixed-point attractors will exist in asynchronous updates as well.In addition, we expect the time-independent nodes of partial fixed points to be preserved in a further 16.2% of transitions, because a synchronous partial fixed point has a corresponding asynchronous partial fixed point under the conditions discussed below.Finally, the initial and final attractors in transitions involving one fixed point and one partial fixed point are preserved, and they are guaranteed to remain distinct when the partial fixed point has time-independent nodes whose state differs from those of the fixed point.The latter accounts for 2.6% of all transitions.Together, 67.9% of transitions for synchronous updates have the initial and final attractors preserved under asynchronous updates.
The time-dependent portions of partial fixed-point attractors associated with negative circuits show acyclic behavior under asynchronous updates.This is because the periodicity of an attractor generally requires the concurrent update of multiple nodes at each time step, while only one (randomly chosen) node is updated at a time under asynchronous updates.However, we did not observe transitions between attractors in which the only difference occurs between time-dependent nodes (i.e., attractors that share the same set of timeindependent nodes and that have identical states on these nodes) in our simulations.

Preservation of synchronous partial fixed points
Asynchronous partial fixed points imply the existence of synchronous partial fixed points (63) but not vice versa.Concerning the forward implication, the states of time-independent nodes of asynchronous partial fixed points are, by definition, guaranteed to remain fixed for all possible states of the time-dependent nodes, and thus under synchronous updates.The reverse implication does not hold because the asynchronous updates cause the time-dependent nodes to reach states outside of their synchronous periodic orbit, making it possible for the downstream time-independent nodes to change.
Notwithstanding, we identify a set of sufficient conditions under which synchronous partial fixed points have corresponding asynchronous partial fixed points.For the state of a time-independent node in a synchronous partial fixed point to change under asynchronous updates, the node must have more than one upstream input that is time-dependent.Otherwise, the time evolution on the attractor is sufficient to conclude that the time-independent state will be stable for all possible values of the time-dependent nodes.The requirement for multiple time-dependent inputs is established by as follows.We refer to the set of time-dependent nodes in the attractor as B 0 and the set of nodes incident on u in G 0 as I u .Then, the set of nodes with at least 2 and iterate the calculation until dB i is empty.Thus, the time-independent nodes not in B i are preserved under asynchronous updates.As a corollary, this implies that |dB 0 | = 0 is a sufficient condition to conclude that synchronous partial fixed points imply the existence of corresponding synchronous partial fixed points.This condition is met in 82.5% of the period-2 attractors, 88.9% of the period-3 attractors, 14.4% of the period-4 attractors, and 37.6% of the higher-period attractors of our simulations.Together, the time-independent nodes of partial fixed-point attractors are guaranteed to be preserved under asynchronous updates in at least 46.3% of cases.(This is calculated by multiplying the frequency of each periodic attractor times its rate of preservation and normalizing by the total number of periodic attractors).If the initial and final attractors in irreversible transitions were uncorrelated, then 21.4% of transitions between partial fixed-point attractors would be preserved under asynchronous updates.We find that 51.2% of transitions between partial fixed points have initial and final attractors that preserved in our simulations.This percentage applies to the 31.7% of transitions between partial fixed points for which both the initial and final states have the same set of time-dependent nodes, yielding a total of 16.2% of all transitions as referenced above.

Motifs generating periodic attractors
The analysis of partial fixed points motivates us to investigate the network structures behind the period-2 and period-4 attractors, which together account for over half of the attractors in our simulations.We attribute the prevalence of period-2 and period-4 attractors specific motifs in the network.Period-2 attractors occur mainly among gene pairs that exhibit mutual negative feedback such as (galR, galS), (mlc, ptsG), (exuR, uxuR), and (mazE, mazF).Period-4 attractors occur among gene pairs in which the first represses the second and the second activates the first, and one of the genes has no autoregulation, as in the cases of (arcA, fnr) and (hns, cspA).These pairs can have a period-4 orbit (1, 0) !(0, 0) !(0, 1) !(1, 1) !(1, 0) that cascades to downstream nodes.

Comparison with alternative formulations
Here, we offer evidence that the conclusions of the paper apply to the transcriptional regulatory network of E. coli in general, even though we focus specifically on the phoB origon (the largest origon) and canalizing rules.Specifically, we examine how changing the origon, the rules, and/or the mathematical formulation of the dynamics would affect the results.
Different origons.In Table S1, we report the statistics of all origons with core size 30 in our RegulonDB model (the remaining origons have core sizes  5).For our purposes, the most relevant measures are the overlaps of the core phoB origon with those of the alternative origons, as well as the overlaps of nodes in the nontrivial SCCs of the phoB origon with those of the alternative origons.As demonstrated in the main text, the ability of a node to admit an irreversible perturbation is related to its proximity to SCCs with positive circuits (in the general case, nontrivial SCCs can be made up of positive and/or negative circuits).Our model shows that most such SCCs of the largest origons are contained within the phoB origon core network (see overlap column in Table S1).Accordingly, the identity and irreversibility probability of nodes that admit irreversible perturbations in our analysis of the phoB origon core network are inclusive of almost all cases that would be found when considering all largest origons.
Threshold-based rules.One alternative to canalizing dynamics is threshold-based dynamics (31), in which the update rules for each node are where u 2 [0, 1] are activation thresholds.Specializing to the case of a uniform threshold u = , the limit !0 corresponds to the case (r, s) = (0, 1) (i.e., all inputs joined by OR operators) whereas the limit a ! 1 corresponds to (r, s) = (0, 0) (i.e., all inputs joined by AND operators).The sorting of inputs y v i by k v i when determining canalization also has a relationship to threshold dynamics with heterogeneous a u .In particular, if v i / k v i , then nodes with small k v i will be easier to activate.The ease of activation of these nodes makes this case analogous to the ascending input sorting in the limit that s ! 1 when r = 0. Other heterogeneous schemes with u drawn uniformly from [0, 1] would correspond most closely to the cases of (r, s) = (0.4, 1) and (r, s) = (0.6, 1) in our simulations.
Bias-based rules.Bias-based rules are a formulation of random Boolean networks in which the update rules are generated by randomly assigning 1 (with probability ⌫) or 0 to each possible input vector.The parameter ⌫ is equivalent to the rule bias used in our analysis.This formulation is mathematically convenient for examining the effect of topology on the stability of Boolean networks (29,30).Bias-based rules apply in the case of treelike graphs in which the activation probabilities of nodes may be regarded as independent, and they do not rely on information about the polarity of interactions.In this work, we operate in the opposite limit: we focus specifically on the core network, where the coregulation of nodes produces correlated activations, and our empirical network allows us to explicitly incorporate edge polarity into the model.
Alternative dynamical formulations.We argue that the predictions of the model apply beyond the case of Boolean rules that are synchronously updated.Related conclusions are expected for asynchronous updates since fixed points are known to be preserved (62) and, as shown above, most partial fixed-points are also preserved under asynchronous update rules.Similar conclusions apply to continuous representations of the dynamics because the rules can be transformed from a discrete to a continuous representation of the state space and time that preserves the stable steady states (42).We examine the outcomes of such a transformation for a simple system below in an effort to develop experimental predictions that are as specific as possible.
Given the correspondence between our results and continuous systems, it is natural to conjecture that transient perturbations may lead to irreversibility in stochastic models of gene regulation that account for the transcription, translation, and degradation (52)(53)(54)(55)(56)(57).This conjecture may be investigated by mapping the irreversible perturbations in our simulations into corresponding parameter changes in the stochastic models.

Comparing with bacterial differentiation and environmental perturbation
It is instructive to compare the irreversibility from transient genetic perturbations with bacterial differentiation processes, such as sporulation in B. subtilus (14), stalking in C. crescentus (13), and life-cycle stages in pathogenic E. coli (64).The results can also be compared with observations of bacterial heterogeneity such as persistence (65) across multiple bacterial species, the mucoid phenotype of P. aeruginosa (65), competence in B. subtilus (14), and metabolic shifts in E. coli in the case of the lac operon ( 15) and other inducible sugars (44).In these examples, genetically identical strains can exhibit diverse phenotypes due to environmental cues, and these phenotypes can persist hysteretically after the environmental cues are removed.While these examples involve environmental changes, our analysis shows that regulatory switching alone is sufficient to alter the state of positive feedback loops, resulting in irreversibility in the transcriptional state of the cell.
Conversely, by relating our analysis with these examples, it follows that the transcriptional irreversibility we characterize can give rise to large phenotypic changes, including morphological and behavioral ones.This transcriptional irreversibility may be leveraged to design synthetic bacterial circuits that generate irreversible changes, emulating recent developments in mammals (10).

Comparison with adaptive evolution responses
In the main text, we explore the question of how similar the set of irreversible response genes to the transient perturbation of crp KO is to the set of genes exhibiting altered expression in the adaptive evolution response to the same KO.In Tables S2 and S3, we list the identity of the genes with the largest shifts in expression (ln ⇢ u /h⇢i) when the adapted strain is cultivated in batch and chemostat conditions, respectively.To facilitate a comparison between our model and the data, we include the sign of regulation of crp ( mod u ), which is supposed to have the opposite sign as the shift in expression.As noted in the main text, 10 of 11 genes in batch cultivation and 33 of 42 genes in chemostat cultivation have changes in expression consistent with the regulatory model.The majority of these genes are found to respond irreversibly in our model.Of the genes that do not, most are directly regulated by crp and negatively autoregulated (malI, lsrR, glcC, prpR, rbsR, and glnG), two are negatively autoregulated but indirectly regulated by crp (metR and purR), and two are not regulated by crp (pdeL and cra).Changes in the first two cases may be due to constitutive activity of the gene promoters, which is not accounted for by our model.The final case may reflect changes to the intracellular environment not captured by the model.The sign of the gene expression responses to adaptive evolution are significantly concordant with those predicted by the model (see main text).In addition, the set of genes is enriched for irreversible response genes to crp.

Strategy to identify specific experimental conditions
Our Boolean framework, presented in the main text, fits into a broader strategy to identify specific conditions under which irreversibility occurs.An experiment to observe irreversibility requires the specification of (i) a candidate irreversible gene perturbation, (ii) an associated irreversible response gene, and (iii) specific cultivation conditions.Figure S5A  Read Archive (SRA) for E. coli experiments in which crp is perturbed.We found 16 instances of experiments characterizing crp downregulation either as a result of genetic perturbations or environmental shifts.Given this information, we examined the attractors that admit irreversibility of crp KO to determine whether they were similar to experimentally observed states, as described in the subsection on transcription-weighted attractor analysis.Briefly, predictions of irreversible response genes associated with attractors that were more similar to experimentally observed states were given higher weights than those that were more divergent.The outcome of the this analysis was a set of autoexcitatory genes that are themselves positively regulated by crp.This is similar to the three-gene schematic in Fig. 1, except that the two-gene positive feedback loop is reduced to a single autoexcitatory gene.Note that from the Boolean modeling, we already know that an AND operator is required for irreversibility (i.e., expression of both genes is required for activation).This requirement specifies the form of the differential equation model needed to model the candidate irreversible genes, as discussed in the subsection on transcriptional differential equation modeling.From these differential equations, it is possible to identify relative values of the parameters necessary for irreversibility to occur and thereby specify the cultivation conditions needed to observed irreversibility.

Transcription-weighted attractor analysis
Methods: We searched the GEO database (59) for all "Expression profiling by high-throughput sequencing" assays conducted in "Escherichia coli" on November 15, 2022, which yielded 495 datasets.Of these, we retained datasets that (i) had at least 10 RNA-seq samples and (ii) were conducted on non-enterohemorrhagic E.
coli, yielding 155 datasets with 5,147 gene expression profiles.We retrieved the raw RNA-seq data from SRA, aligned it to the MG1655 genome (NCBI Reference: NC_000913.3)using Rockhopper (66), and calculated the TPM.The TPM calculations excluded alignments to ribosomal RNA genes.
We next extracted the TPM counts for the 87 genes in the core network and converted them to log-scale using the formula ⇣ i = log 10 (z i + 10 10 ) + 10, where i is an index over genes.We subsequently identified 16 instances in which crp downregulation was measured within a dataset.For each case, we calculated the average expression of each gene before (t = O) and after (t = Q) crp downregulation.The average expression was binarized to 1 if it was larger than the gene's median in the entire dataset and 0 otherwise.Next, we compared the observed expression states before and after crp downregulation with the attractors generated in our Boolean model before and after crp KO.As in the Boolean modeling, we use x to represent the binarized expression obtained from data.Furthermore, we use x obs i and x att i to denote the attractors obtained from experimental observations and our Boolean modeling, respectively, and we define i to be the fraction of cases for which x i = 1 in the expression data.Then, the similarity between observed and modeled states is quantified using the Hamming similarity and likelihood of x obs i matching x att i if each x i is an independently assigned Bernoulli random variable with parameter i : The irreversible responses are averaged using Eqs.( 9) and ( 10) according to the following procedure: where j`= 1 if the `th gene responds irreversibly in attractor j and j`= 1 if the `th gene responds reversibly in attractor j.
(iii) For each instance of crp downregulation, indexed by k 2 {1, ..., 16}, fix x (k,O) and x (k,Q) and calculate the attractor weights Using these steps, we calculated the weighted probability of responding reversibly and irreversibly for each gene.
Results: Figure S6 summarizes the results of the probability for each gene to respond irreversibly when weighting the attractors by their similarity to observed transcriptional states.The horizontal axis indicates the fraction of crp KO for which each gene changed state between the initial attractor and the attractor reached after crp KO.Given that the gene changed, the vertical axis reports the fraction of instances that it remained altered in the final attractor.Comparing Fig. S6A to Fig. S6B reveals that the patterns of irreversibility as they relate to network structure are mostly preserved across the two weighting strategies.In particular, genes likely to respond irreversibly lie in the top right of each plot and are highlighted by a pink box.All such genes exhibit the key motif in which crp activates the target gene (X), which also activates itself.This result motivates an exploration of more granular differential equation models covered in the next section.
In the remaining sectors of Fig. S6, we see the irreversibility as it relates to other regulatory architectures.
In the lower right, we see a group of autorepressive genes activated by crp that turn off upon crp KO, but are restored with crp expression.Such genes would be candidates for negative controls in a high throughput experiment, since they are likely to respond, but not likely to be irreversible.Moreover, as the distance of the candidate irreversible response genes to crp increases, the probability of changing upon crp KO decreases.
This occurs because the response to the KO depends on a larger number of regulatory rules.At the same time, once a change occurs, the probability that it is irreversible becomes less concentrated around 0 and 1.This may be attributed to the larger number of ways to fit the candidate gene inside or downstream of a positive circuit for a given distance to crp, regulatory sign, and self-regulation.

Transcription differential equation modeling Analysis:
The key motif identified in Fig. S6 and the Boolean modeling together imply that irreversibility for the crp KO requires that both crp and X be present for X to be activated (i.e., an AND operator).To transform this Boolean logic into a set of continuous differential equations, we use the multivariate polynomial interpolation method (42) and make the following approximations: (i) transcription is fast relative to translation, (ii) dilution dominates protein degradation as a mechanism for protein loss, and (iii) the basal transcription rate of gene X is negligible.This yields the following system of equations for the expression of crp (C) and the downstream gene (X): where 1. ↵ is the basal transcription rate of crp,

2.
C and X are transcriptional activation parameters (resulting from transcription factor binding), 3. is the dilution rate, 4. ⌘ C and ⌘ X are Hill coefficients (which designate more switch-like behavior as they become larger), and 5. K C and K X are the concentrations for half-maximal activation.
These phenomenological parameters are measurable by targeted experiments, and as such, they may guide the choice of gene X.Here, we assume for simplicity that the impact of the basal transcription of X is negligible compared to the regulatory activation.This assumption does not impact the following analysis.
For the purposes of analyzing the parameters that lead to irreversibility, it is convenient to rescale the variables and parameters using Substitution of Eq. ( 13) into Eqs.( 11) and ( 12) yields the following rescaled equations where we have suppressed the dependence of ¯ X on C in the second equation.We note that these equations yield biologically meaningful solutions when C > 0 and X > 0, since these variables represent concentrations of proteins in the cell.In addition, the rate parameters ↵, ¯ C , ¯ X , and as well as the concentrations K C and K X are all larger than zero.
We proceed to analyze the fixed points of this system of equations and their stability.Because gene X does not regulate gene C, the existence and stability of fixed points for C will not depend on X.Furthermore, the results for Eq. ( 14) will be immediately applicable to Eq. ( 15), since C will be time-independent at a fixed point.For these reasons, we suppress the subscript C in the following analysis to simplify the notation, and we emphasize that C is interchangeable with X.
We proceed by setting f ( C) = 0 to obtain a polynomial equation for the fixed points of C: This equation has one or three solutions for biologically meaningful values of the parameters.By setting df /d C = 0, we obtain the condition When this equation is satisfied simultaneously with Eq. ( 16) a pitchfork bifurcation occurs at a critical value C⇤ .
This critical value may be obtained from the following steps: (i) adding C⌘ C to each side of Eq. ( 17), (ii) dividing the result by Eq. ( 16) to obtain a self-consistent equation for 1 + C⌘ C in terms of C, and (iii) using the self-consistency result to substitute out terms with C⌘ C in favor of C to obtain a quadratic equation in C.
The resulting equation is which has roots at For a given value of ⌘ > 1, there is a critical value of ↵ and ¯ where multistability arises.This occurs when f , df /d C, and d 2 f/d C2 all equal zero for the same value of C. Setting d 2 f/d C2 = 0, we obtain Notably, Ccrit is independent of ¯ .Substituting this value of C into df /d C = 0 yields ¯ , which reduces to after substituting in Eq. ( 20) for C. Substituting Eqs. ( 20) and ( 21) into Eq.( 16) results in With these critical values determined, we can determine the region of (↵, ¯ ,⌘)-space that allows for multiple stable fixed points.
Figure S7A illustrates the regions of parameter space for which the system is multistable.These regions are determined by fixing ⌘, calculating the critical point, then progressively increasing ¯ .For each value of Next, we solve for the concentrations C+ and C that maximize and minimize f ( C) by finding the roots of f 0 ( C).Then, we compute f ( C+ ) and f ( C ) with the adjusted value of ↵.This procedure yields the largest and smallest values of ↵ for this value of ¯ , which are Since only ↵ 0 are physically meaningful, we constrain ↵min 0. We also plot the critical points (determined from output Eq. ( 22) and Eq. ( 21)) parametrically as a function of ⌘ 2 [1,10].From this analysis, it is possible to identify parameters suitable for realizing irreversibility as envisioned by the Boolean model, as shown next.
Simulations: Figure S7B shows an example of a transient knockout of crp causing the irreversible response of gene X from simulation of Eqs. ( 14) and ( 15).The initial parameters for crp and the response gene are indicated in Fig. S7A by the bold green and purple circles, respectively (the colors correspond to the value of ⌘ in the legend).By Eq. ( 13), the regulatory activation for X already includes the impact of crp regulation.This choice of parameters has a single fixed point for C, which is highly expressed.The high expression of C coupled with the high initial expression of X cause the system to evolve toward a state with high concentrations of both genes.When crp is subsequently knocked out (implemented by decreasing ¯ to 0.1 as indicated by the faded green circle in Fig. S7A), the gene concentrations evolve toward a fixed point with both concentrations near their (low) ↵.Upon restoration of the regulatory activation of crp, its concentration increases.The concentration X, however, remains low because there is a fixed point for Eq. ( 15) with a low X.Although this is only one choice of parameters, it is clear that there many possible choices that yield concentration trajectories similar to those observed in Fig. S7B.

Examples of potential experiments
The results of the differential equation modeling help refine the selection of candidate irreversible perturbations and irreversible response genes by allowing us to evaluate their probability of exhibiting irreversibility using qualitative knowledge about the activation strength.From the transcriptional data we collected, it appears that crp has a fairly large constitutive activation since it remains expressed at detectable levels even when E. coli is cultivated with glucose as the primary carbon source (which is known to inhibit crp).A larger constitutive activation restricts the values of regulatory activation (and Hill coefficient) that allow crp to exhibit multistability (Fig. S7A).Since we want crp expression to be restored after the KO is removed, monostability of Eq. ( 14) is desirable so that C will spontaneously increase once ¯ is increased.Concerning the response gene X, irreversibility requires it to have parameters that admit multistability when crp is active.This is facilitated by smaller constitutive activations, larger regulatory activations, and larger Hill coefficients.In most cases, larger values of the regulatory activation and Hill coefficient for crp also facilitate multistability.Under these conditions, the drop in expression in crp due to the KO causes a drop in expression of the response gene and restoration of crp expression fails to restore expression of the response gene as explained in Fig. S7B.
From these considerations, one can imagine an experiment in which initially high crp expression is achieved by cultivating E. coli in glycerol.The regulated gene may also be manipulated to be highly expressed, for example by including zinc in the cultivation conditions to ensure that expression of zraR is initially large as well.Then, the reduction in regulatory activation of crp can be achieved by inducing CRISPR-interference.
Subsequently un-inducing CRISPR should allow the crp expression to recover.However, if the concentration of zinc is low enough, the regulatory activation of zraR can be tuned to be in the multistable region.Because zraR expression should be low during the induction of CRISPR, it should not be able to recover even though crp expression does.
One key assumption in Eqs.(11) and ( 12) is that the regulatory logic is qualitatively similar to a Boolean AND operator.Although this information is difficult to ascertain from the literature, it may be possible to infer cases in which the regulation is more likely to be AND based on the location of the transcription factor binding sites to DNA.If these sites are overlapping, then it is less likely that the regulatory logic is AND, whereas if CRP and the regulated protein form a dimer, the logic is more likely to be AND.If the regulatory rule is qualitatively an OR function, then Eq. ( 12) must be changed accordingly.Irreversibility may still be possible, but this would require employing an OE perturbation.
As shown in Fig. S7A, genes with stronger self-activation and larger Hill coefficients (i.e., more switchlike behavior) tend to have wider regions of multistability.Thus, the literature can be consulted to identify irreversible response gene candidates that possess these attributes.We also note that several genes may have activation parameters that can be altered via the inclusion of specific metabolites (e.g., zinc for the case of zraR above).The inclusion of these chemicals may be tuned to obtain a regulatory activation strength that admits multistability and to ensure that the response gene is initially ON.An analogous strategy has been previously employed to show multistability in inducible sugar utilization (44).
Finally, the response may be heterogeneous among cells.Such heterogeneity could occur, for example, as a result of the stochastic allocation of cellular components during cytokinesis.Since single-cell sequencing techniques in bacteria are still in their infancy, other approaches, like super-resolution fluorescent microscopy techniques, would be required to observe a heterogeneous response.The considerations in this section successfully narrow the scope of candidate irreversible perturbations and their associated response genes, but still leave substantial work to develop the required molecular biology tools implement the transient perturbation, ascertain the nature of the regulation, and detect the expression in time.Asterisks indicate parameters (r, s) for which the diffuse organization has a signficantly larger number of attractors as quantified by the Kruskal-Wallis test with a Bonferroni-corrected p-value < 0.004.The boxes, orange lines, upper whiskers, and lower whiskers denote the interquartile range, median, maximum (excluding fliers), and minimum, respectively.Fliers, which are data points that are more than 1.5 times the interquartile range above the median, are plotted separately.10)).In both panels, the genes in the box belong the key motif in which crp activates the gene and the gene X activates

Supplementary Tables
summarizes how the scope of potential experiments narrows.Then, the flow chart in Fig. S5B explains how the various levels of modeling (which themselves are based on different kinds of input data) narrow the scope.With (i)-(iii) specified, the designed experiment would employ CRISPR-interference to carry out the gene perturbation and quantitative polymerase chain reaction (qPCR) or RNA-seq to measure the expression irreversible response gene at different time points.Overview One major outcome of the Boolean modeling is the prominence of crp KO as a perturbation that leads to irreversibility.(The fact that targeted KOs have fewer externalities on the use of cellular resources than OEs encouraged us to focus specifically on transient crp KO.)This motivated us to examine the Sequencing

Fig. S1
Fig. S1 Number of attractors across realizations.Boxplots of the number of attractors across the M = 20 realizations compare the number under the diffuse (blue) and concentrated (green) rule organization.Asterisks indicate parameters (r, s) for which the diffuse organization has a signficantly larger number of attractors as quantified by the Kruskal-Wallis test with a Bonferroni-corrected p-value < 0.004.The boxes, orange lines, upper whiskers, and lower whiskers denote the interquartile range, median, maximum (excluding fliers), and minimum, respectively.Fliers, which are data points that are more than 1.5 times the interquartile range above the median, are plotted separately.

Fig. S2
Fig. S2 Convergence of irreversibility estimated from independent sets of realizations.(A) Boxplots of the RMSD of the irreversibility probability across min(Z(M, M 0 ), 1,000) independent pairs of realizations of the rules for the ascending input sorting (diffuse control).(B) Same quantities as in (A) for the descending input sorting (concentrated control).The values of the bias (s) and nestedness (r) parameters are indicated above each panel.Boxplot symbols retain their meanings from Fig. S1 (there are no fliers).

Fig. S3
Fig. S3 Comparison of irreversible response probability depending composition of circuits in the SCC containing the irreversible response gene.(A) Histograms of the irreversibility response probability across all network ensemble parameters (r, s) and perturbations of crp under the diffuse control scenario.Genes are divided into groups by whether they belong to an SCC with only positive circuits (red) or a mixture of positive and negative circuits (blue).(B) Same as (A), but for the concentrated control scenario.

Fig. S4
Fig. S4 Comparison of irreversibility when weighting the results by attractor basin size versus weighting them uniformly by attractor.(A) Scatter plot of the irreversibility of each gene averaged over perturbation types, rules realizations, and attractors in the diffuse control scenario (ascending input ordering).The gray dashed line is a linear fit to the data, whose slope indicates the quantitative change in irreversibility due to attractor weighting.The R 2 indicates the value of the coefficient of determination.(B) Same as (A), but for the concentrated control scenario.

Fig. S5
Fig. S5 Schematic for designing experiments to find irreversibility.(A) Diagram of how the scope is narrowed from all possible gene perturbations, irreversible response genes, and cultivation conditions to a manageable number of experimental predictions.(B) Analysis and input data needed to narrow the scope at each stage.

Fig. S6
Fig. S6 Probability of irreversible response after weighting attractors by their similarity to observed states.(A) Probability of responding irreversibly conditioned on a gene changing in response to crp KO.Probabilities indicated on the vertical axis are expressed as a fraction of the corresponding values indicated on the horizontal axis.Attractors are weighted proportionally to the exponential of the negative Hamming distance (see Eq. (9)).Node face colors correspond to the sign of crp regulation, node edge colors correspond to the sign of self-regulation, and opacities correspond the distance from crp as summarized in the at right.(B) Same plot as in (A), when attractors are weighted by likelihood (see Eq. (10)).In both panels, the genes in the box belong the key motif in which crp activates the gene and the gene X activates

Fig. S7
Fig. S7 Irreversibility in differential equation models of a the key motif in Fig. S6.(A) Multistability regions in the constituative activation-regulatory activation plane, color-coded by the values the Hill coefficient (⌘) indicated in the legend.Stars indicate the critical point (derived from Eqs. (21) and (22)) for each ⌘, while the gray dashed line shows the trajectory of the critical point as ⌘ is continuously increased from 1 to 10. Circles indicate exemplar parameters for the constitutive and regulatory activation crp (green, indicating ⌘ C = 2) and the response gene (purple, ⌘ X = 5) when crp active (bold) and knocked out (faded).(B) Concentrations of crp and the response gene prior to (teal), during (gray), after crp (pink), where arrows indicate the direction of time.During each the is integrated until the system reaches the stable state.The gray dotted line indicates the basin boundary (unstable fixed point in the response-gene concentration) when crp is active.
) With k still fixed, normalize the A jk to Ãjk = A jk / P |C on | j=1 A jk and compute the average probability of irreversible and reversible response weighted by the kth conditions, which are given by I k`= P |C on | j=1 Ãjk j`a nd R k`= P |C on | j=1 Ãjk j`, respectively.
(v) Compute hI `i = P 16 k=1 I k`a nd hR `i = P 16 k=1 R k`, the average of I k`a nd R k`, respectively, across conditions.

Table S1 Statistics of the origons with core size 30
.

Table S2 Irreversibility of genes with log-fold change
> 0.5 when

Table S3 Irreversibility of genes with log-fold change
> 0.5 when